

# Fault Tolerant Techniques for FPGAs: A Review

Raghunath B.H<sup>1</sup>, Dr Aravind H S<sup>2</sup>

<sup>1</sup>Department of ECE, AIT, Bengaluru
raghunathbh@gmail.com

<sup>2</sup>Department of ECE, JSSATE, Bengaluru
aravindhs1@gmail.com

Abstract—Field-Programmable Gate Arrays (FPGAs) have emerged over the last few decades as a leading option for digital circuit implementation. Because FPGAs can be reconfigured at runtime, they offer a way to address reliability and availability, which are serious concerns in safety-critical applications. This review surveys some popular fault-detection methods and gives an overview of the partial reconfiguration technique in FPGA-based systems.

Index Terms— FPGA, single event upsets (SEUs), SRAM, Look Up Table (LUT), Triple Modular Redundancy (TMR), Application Specific Integrated Circuits (ASIC).

## I. INTRODUCTION

Field-Programmable Gate Arrays (FPGAs) are electrically programmable silicon devices that can implement many kinds of digital systems, such as DSP and network-based applications. SRAM-based FPGAs are used in the hardware of safety-critical applications, where they are easily affected by radiation. Like other IC chips, FPGAs have become more vulnerable to faults. SRAM-based FPGAs are prone to both transient and permanent faults, and a fault may occur anywhere in the device. Fault-tolerant system design involves the detection, diagnosis and correction of faults. An FPGA requires more area than a standard-cell ASIC, is slower than an ASIC and consumes more dynamic power. These disadvantages are offset by one major advantage: the ability to reconfigure at runtime. This review is structured as follows: Section II gives a brief overview of FPGA architecture, Section III describes the classification of faults, the fault models used and common tools, Section IV covers related previous work and Section V concludes.

## II. FPGA ARCHITECTURE

FPGAs are composed of a number of programmable resources that can be configured to implement any desired logical function. Fig. 1 shows the basic architecture of an FPGA. It consists of an array of programmable logic blocks, memory blocks and multiplier blocks surrounded by programmable routing interconnect. FPGA configuration can be done at runtime, i.e. specific parts of the FPGA can be reconfigured without disturbing the operation of other areas. This ability, known as Partial Reconfiguration (PR), provides a high degree of flexibility and efficiency.
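To make the idea of a configurable logic block concrete, the following minimal sketch models a Look-Up Table (LUT) as a list of configuration bits addressed by the inputs. It assumes a LUT-based logic block, which the text above does not describe in detail; the names `make_lut` and `lut_eval` are hypothetical and do not belong to any vendor tool.

```python
from itertools import product


def make_lut(func, n_inputs):
    """Return the configuration bits (truth table) that make an n-input LUT implement func."""
    return [int(bool(func(*bits))) for bits in product((0, 1), repeat=n_inputs)]


def lut_eval(config_bits, inputs):
    """Evaluate the LUT: the inputs form an address into the configuration bits."""
    address = 0
    for bit in inputs:
        address = (address << 1) | bit
    return config_bits[address]


# The same 2-input LUT is configured first as AND, then as XOR.
and_cfg = make_lut(lambda a, b: a & b, 2)
xor_cfg = make_lut(lambda a, b: a ^ b, 2)
```

Only the stored bits differ between the two functions, which is why rewriting configuration memory changes the behaviour of the logic and why a corrupted configuration bit can silently change it.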

## III. FAULT TYPES, MODELS AND TOOLS

#### A. Fault Types:

The faults that occur in FPGA-based systems can be categorised as follows:

- Aging faults: faults caused by the degradation of components with age.
- Manufacturing faults: stuck-at-0 or stuck-at-1 faults that result from manufacturing defects.
- Single Event Upsets (SEUs) and Single Event Transients (SETs): faults that occur when an energetic particle such as a
  neutron or proton collides with an atom in the silicon structure, flipping the contents of memory bits and
  causing the system to malfunction.
- Software Faults:

#### B. Fault Models:

Fault models are essential for generating and evaluating test vectors. Common models are:

- Stuck-at fault model, in which a line is stuck at 0 or stuck at 1.
- Transistor stuck-on/stuck-off fault model.
- Wire open/short on interconnect.
- Delay fault model.
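The role of a fault model in evaluating test vectors can be illustrated with a small stuck-at simulation. This is only a sketch under stated assumptions: the circuit `y = (a AND b) OR c`, the net names and the helper functions are hypothetical examples, not taken from the reviewed work.

```python
from itertools import product

NETS = ("a", "b", "c", "n1", "y")  # hypothetical nets of the example circuit


def simulate(a, b, c, fault=None):
    """Evaluate y = (a AND b) OR c; fault is (net_name, stuck_value) or None."""
    def force(name, value):
        # A stuck-at fault overrides whatever value the net would carry.
        if fault is not None and fault[0] == name:
            return fault[1]
        return value

    a, b, c = force("a", a), force("b", b), force("c", c)
    n1 = force("n1", a & b)
    return force("y", n1 | c)


def detecting_vectors(fault):
    """Return the input vectors whose output differs between good and faulty circuit."""
    return [v for v in product((0, 1), repeat=3)
            if simulate(*v) != simulate(*v, fault=fault)]


all_faults = [(net, value) for net in NETS for value in (0, 1)]
coverage = {fault: detecting_vectors(fault) for fault in all_faults}
```

A test vector detects a fault only if it makes the faulty output differ from the fault-free output. Here, for example, `n1` stuck at 0 is detected only by the vector `(1, 1, 0)`.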

#### C. Tools:

Some of the popular tools used in FPGA systems are

- Placement and Routing Tools
- Readback
- Scrubbing
- Dynamic Partial Reconfiguration.
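Readback and scrubbing can be sketched in software as follows. The model is deliberately simplified and rests on assumptions: configuration memory is represented as a list of byte frames, a fault-free "golden" copy is assumed to be available, and the function names are hypothetical rather than a vendor API.

```python
import random


def inject_seu(frames, rng):
    """Flip one random bit in one random frame, imitating a single event upset."""
    index = rng.randrange(len(frames))
    bit = rng.randrange(len(frames[index]) * 8)
    frame = bytearray(frames[index])
    frame[bit // 8] ^= 1 << (bit % 8)
    frames[index] = bytes(frame)
    return index


def readback_compare(frames, golden):
    """Readback: return the indices of frames that differ from the golden copy."""
    return [i for i, (cur, ref) in enumerate(zip(frames, golden)) if cur != ref]


def scrub(frames, golden):
    """Scrubbing: rewrite every corrupted frame from the golden copy."""
    corrupted = readback_compare(frames, golden)
    for i in corrupted:
        frames[i] = golden[i]
    return corrupted


rng = random.Random(0)
golden = [bytes(rng.randrange(256) for _ in range(8)) for _ in range(16)]
frames = list(golden)           # configuration memory as currently loaded
hit = inject_seu(frames, rng)   # one frame is upset
repaired = scrub(frames, golden)
```

In this sketch readback only locates the corrupted frame, while scrubbing restores it. A real device would typically run such a pass periodically, since an upset can occur at any time.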

## IV. RELATED PREVIOUS WORK DONE

To mitigate the effects of SEUs in FPGA memory, many fault-tolerant methods have been proposed over the last few decades. These methods fall into two groups: (i) reconfiguration-based methods, which attempt to restore the proper values of the configuration bits whenever an SEU occurs [5], and (ii) redundancy-based methods, which aim to mask fault propagation to the circuit's outputs [6]. Fault masking is done using the Triple Modular Redundancy (TMR) method, in which three identical copies of the system run in parallel and a majority voter outputs the value on which at least two of the three copies agree, as shown in Fig. 2.



Figure 2. TMR method with Majority Voter

Memory blocks, routing blocks and logic blocks are all vulnerable to SEUs, so redundancy must be employed to protect them. The potential of the TMR method to mitigate SEUs was tested in early experiments using simulation tools [9].
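The masking behaviour of TMR can be shown with a short simulation. It is a minimal sketch, assuming that each copy's output is an integer and that an upset is modelled as an XOR mask on that output; `tmr`, `majority` and the example `adder` module are hypothetical names.

```python
def majority(a, b, c):
    """Bitwise 2-of-3 majority vote over three integer outputs."""
    return (a & b) | (a & c) | (b & c)


def tmr(module, x, faults=(None, None, None)):
    """Run three copies of module on x; faults[i] is an XOR mask modelling an upset in copy i."""
    outputs = []
    for mask in faults:
        out = module(x)
        if mask is not None:
            out ^= mask  # corrupt this copy's output
        outputs.append(out)
    return majority(*outputs)


def adder(x):
    """Hypothetical 8-bit module under protection."""
    return (x + 3) & 0xFF


clean = tmr(adder, 5)
masked = tmr(adder, 5, faults=(None, 0x10, None))    # one faulty copy
broken = tmr(adder, 5, faults=(0x01, 0x01, None))    # two copies fail identically
```

A single faulty copy is outvoted, so `masked` equals `clean`. When two copies fail in the same way the voter follows the wrong majority, which is one reason TMR is often combined with a repair mechanism such as reconfiguration.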

Reconfiguration of memory blocks can be done using either a fine-grained or a coarse-grained method. Fine-grained reconfigurability is achieved with Look-Up Tables, with the implementation done at RTL. Coarse-grained reconfigurability is achieved with different functional units. The fine-grained method is less efficient because it incurs a large area overhead and causes poor *routability*. The coarse-grained method is preferred because its data paths are wider than 1 bit.

Reiner Hartenstein [12] reviewed various architectures for coarse-grained reconfigurable hardware, including mesh-based, crossbar-based and linear-array-based architectures.

A number of applications utilizing reconfigurable hardware, and some example systems used in these applications, are discussed in [14]. Reconfigurable hardware design issues that are critical to embedded system designers are also covered in the literature.

L. Sterpone et al. [18] proposed a novel technique for fault-tolerant systems on SRAM-based FPGAs, carried out a set of experiments using their methodology and compared the results with existing solutions. The technique first detects an error caused by a fault during system operation, then identifies the location of the fault and quickly recovers from it to bring the system back into proper operation.

M. Straka et al. [19] proposed a fault-tolerant design for SRAM-based FPGAs using partial dynamic reconfiguration. Fig. 3 shows a typical block diagram of the partial dynamic reconfiguration technique. This technique allows only the faulty module to be reconfigured at runtime, without affecting the operation of the other modules.



Figure 3. Partial Reconfiguration Technique
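The idea of reconfiguring only the faulty module can be sketched by combining a voter with per-module repair. This is an illustrative model, not the architecture of [19]: each module is assumed to be a 2-input LUT loaded from a hypothetical golden partial bitstream `GOLDEN`, and the `Module` class and `step` function are invented for the example.

```python
GOLDEN = (0, 1, 1, 0)  # hypothetical partial bitstream: a 2-input XOR truth table


class Module:
    """One reconfigurable region holding its own copy of the configuration."""

    def __init__(self):
        self.config = list(GOLDEN)
        self.reconfig_count = 0

    def run(self, a, b):
        return self.config[(a << 1) | b]

    def reconfigure(self):
        """Reload this module alone from the golden partial bitstream."""
        self.config = list(GOLDEN)
        self.reconfig_count += 1


def step(modules, a, b):
    """Vote, locate any disagreeing module, and reconfigure only that one."""
    outputs = [m.run(a, b) for m in modules]
    voted = 1 if sum(outputs) >= 2 else 0
    faulty = [i for i, out in enumerate(outputs) if out != voted]
    for i in faulty:
        modules[i].reconfigure()
    return voted, faulty


modules = [Module() for _ in range(3)]
modules[1].config[2] ^= 1            # upset in module 1's configuration
voted, faulty = step(modules, 1, 0)  # inputs (1, 0) address the corrupted bit
```

The voter masks the error while the comparison against the voted value locates the faulty module, and only that module is rewritten. Note that in this sketch an upset is noticed only when the inputs exercise the corrupted configuration bit.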

## V. CONCLUSION

Many safety-critical applications demand SRAM-based FPGAs because of their high-throughput capabilities and low cost. However, SRAM-based devices are subject to radiation effects such as Single Event Upsets (SEUs) and Single Event Transients (SETs). In this review, various fault-tolerant techniques for FPGA-based systems have been presented, along with a classification of faults and popular tools. The partial-reconfiguration-based fault-tolerant technique has also been briefly explained. Safety-critical applications choose SRAM-based FPGAs as the primary option because of their runtime reconfigurability, reprogrammability and low system development costs.

#### REFERENCES

- [1] Zhang, Hongyan, Lars Bauer, Michael A. Kochte, Eric Schneider, Claus Braun, Michael E. Imhof, Hans-Joachim Wunderlich, and Jorg Henkel. "Module diversification: Fault tolerance and aging mitigation for runtime reconfigurable architectures", 2013 IEEE International Test Conference (ITC), 2013.
- [2] L. Vavousis, A. Apostolakis, M. Psarakis, "A Fault Tolerant Approach for FPGA Embedded Processors Based on Runtime Partial Reconfiguration," J Electron Test, Springer, New York, 2013.
- [3] M. Psarakis, A. Apostolakis, "Fault Tolerant FPGA Processor Based on Runtime Reconfigurable Modules," 17th IEEE European Test Symposium (ETS), 2012.
- [4] P. Graham, M. Caffrey, D.E. Johnson, N. Rollins, and M. Wirthlin, "SEU Mitigation for Half-Latches in Xilinx Virtex FPGAs," IEEE Trans. Nuclear Science, vol. 50, no. 6, pp. 2139-2146, Dec. 2003.

- [5] C. Carmichael, M. Caffrey, and A. Salazar, "Correcting Single-Event Upset through Virtex Partial Reconfiguration," Xilinx Application Notes, XAPP216, 2000.
- [6] F. Lima Kastensmidt, G. Neuberger, R. Hentschke, L. Carro, and R. Reis, "Designing Fault-Tolerant Techniques for SRAM-Based FPGAs," IEEE Design and Test of Computers, pp. 552-562, Nov./Dec. 2004.
- [7] F. Lima, L. Carro, and R. Reis, "Designing Fault Tolerant System into SRAM-Based FPGAs," Proc. IEEE/ACM Design Automation Conf., pp. 650-655, June 2003.
- [8] S. Habinc Gaisler Research, "Functional Triple Modular Redundancy (FTMR) VHDL Design Methodology for Redundancy in Combinational and Sequential Logic," www.gaisler.com, 2002.
- [9] N. Rollins, M.J. Wirthlin, M. Caffrey, and P. Graham, "Evaluating TMR Techniques in the Presence of Single Event Upsets," Proc. Military and Aerospace Programmable Logic Design (MAPLD 2003), 2003.
- [10] M. Bellato, P. Bernardi, D. Bortolato, A. Candelori, M. Cerchia, A. Paccagnella, M. Rebaudengo, M. Sonza Reorda, M. Violante, and P. Zambolin, "Evaluating the Effects of SEUs Affecting the Configuration Memory of an SRAM-Based FPGA," Proc. IEEE Design Automation and Test in Europe, pp. 188-193, 2004.
- [11] Anupam Chattopadhyay, "Ingredients of Adaptability: A Survey of Reconfigurable Processors", Hindawi Publishing Corporation, VLSI Design, Volume 2013, Article ID 683615.
- [12] Reiner Hartenstein, "A Decade of Reconfigurable Computing: a Visionary Retrospective", DATE'01 Proceedings of the conference on Design, automation and test in Europe, IEEE Press Piscataway, NJ, USA, 2001.
- [13] K. Karuri, A. Chattopadhyay, S. Kraemer, R. Leupers, G. Ascheid, H. Meyr, "A Tool Flow for Design Space Exploration of Partially Re-configurable Processors", Integrated Signal Processing Systems, RWTH Aachen University 52056 Aachen, Germany.
- [14] Philip Garcia, Katherine Compton, Michael Schulte, Emily Blem, and Wenyin Fu, "An Overview of Reconfigurable Hardware in Embedded Systems", Hindawi Publishing Corporation, EURASIP Journal on Embedded Systems, Volume 2006, Article ID 56320.
- [15] F. Lima Kastensmidt, G. Neuberger, R. Hentschke, L. Carro, R. Reis, "Designing Fault-Tolerant Techniques for SRAM-Based FPGAs," IEEE Design and Test of Computers, pp. 552-562, Nov./Dec. 2004.
- [16] Massimo Violante. "A new placement algorithm for the optimization of fault tolerant circuits on reconfigurable devices", Proceedings of the 2008 workshop on Radiation effects and fault tolerance in nanometer technologies (WREFT 08), 2008.
- [17] B. Harikrishna, S. Ravi, "A Survey on Fault Tolerance in FPGAs," 978-1-4673-4603-0/12/\$31.00, IEEE, 2012.
- [18] L. Sterpone. "A New Reliability-Oriented Place and Route Algorithm for SRAM-Based FPGAs", IEEE Transactions on Computers, 6/2006.
- [19] M. Straka, J. Kastil, and Z. Kotasek, "Modern fault tolerant architectures based on partial dynamic reconfiguration in FPGAs," in 13th IEEE International Symposium on Design and Diagnostics of Electronic Circuits and Systems. New York, NY, USA: IEEE Computer Society, 2010, pp. 336–341.
- [20] Jamuna et al, "Fault Tolerant Techniques for Reconfigurable Devices: a brief Survey" ijaiem, Volume 2, Issue 1, 2013.